Abstract: Natural language, whether spoken, written, or typed, makes up much of human communication. A large share of this language describes the visible world, either directly around us or as captured in images and video. This paper reviews a system that automatically generates natural-language descriptions of video. The framework is divided into two parts, training and testing. In the training part, the system is trained on videos paired with their descriptions, such as the activities of the objects in each video. In the testing part, a video is given as input and the system returns a description of it as output, evaluating it against the videos stored in a database. Combining natural-language processing (NLP) with computer vision to generate English descriptions of visual information is an important area of active research.
Keywords: Natural-language processing (NLP), computer vision, video evaluation, language descriptions, video processing.